Skip to content

Conversation

@mmha
Copy link
Contributor

@mmha mmha commented Sep 19, 2025

This patchs implements array constructors and destructors for multidimensional arrays. This works by bitcasting the pointer to the first element to a one-dimensional array type of the same extent before lowering to a loop.

This patchs implements array constructors and destructors for multidimensional arrays. This works by bitcasting the pointer to the first element to a one-dimensional array type of the same extent before lowering to a loop.
@mmha mmha requested a review from erichkeane September 19, 2025 17:27
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Sep 19, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 19, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clangir

Author: Morris Hafner (mmha)

Changes

This patchs implements array constructors and destructors for multidimensional arrays. This works by bitcasting the pointer to the first element to a one-dimensional array type of the same extent before lowering to a loop.


Full diff: https://github.com/llvm/llvm-project/pull/159820.diff

4 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenClass.cpp (+10-4)
  • (modified) clang/lib/CIR/CodeGen/CIRGenDecl.cpp (+21-5)
  • (modified) clang/test/CIR/CodeGen/array-ctor.cpp (+69)
  • (modified) clang/test/CIR/CodeGen/array-dtor.cpp (+71)
diff --git a/clang/lib/CIR/CodeGen/CIRGenClass.cpp b/clang/lib/CIR/CodeGen/CIRGenClass.cpp
index 0a8dc2b62fe21..be416f6f69eb0 100644
--- a/clang/lib/CIR/CodeGen/CIRGenClass.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenClass.cpp
@@ -639,12 +639,22 @@ void CIRGenFunction::emitCXXAggrConstructorCall(
   // are probably legitimate places where we could assume that this
   // doesn't happen, but it's not clear that it's worth it.
 
+  auto arrayTy = mlir::cast<cir::ArrayType>(arrayBase.getElementType());
+  mlir::Type elementType = arrayTy.getElementType();
+
+  // This might be a multi-dimensional array. Find the innermost element type.
+  while(auto maybeArrayTy = mlir::dyn_cast<cir::ArrayType>(elementType))
+    elementType = maybeArrayTy.getElementType();
+  cir::PointerType ptrToElmType = builder.getPointerTo(elementType);
+
   // Optimize for a constant count.
   if (auto constantCount = numElements.getDefiningOp<cir::ConstantOp>()) {
     if (auto constIntAttr = constantCount.getValueAttr<cir::IntAttr>()) {
       // Just skip out if the constant count is zero.
       if (constIntAttr.getUInt() == 0)
         return;
+
+      arrayTy = cir::ArrayType::get(elementType, constIntAttr.getUInt());
       // Otherwise, emit the check.
     }
 
@@ -655,10 +665,6 @@ void CIRGenFunction::emitCXXAggrConstructorCall(
     cgm.errorNYI(e->getSourceRange(), "dynamic-length array expression");
   }
 
-  auto arrayTy = mlir::cast<cir::ArrayType>(arrayBase.getElementType());
-  mlir::Type elementType = arrayTy.getElementType();
-  cir::PointerType ptrToElmType = builder.getPointerTo(elementType);
-
   // Tradional LLVM codegen emits a loop here. CIR lowers to a loop as part of
   // LoweringPrepare.
 
diff --git a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
index d028a98eaad02..65e3e71ad9852 100644
--- a/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenDecl.cpp
@@ -713,10 +713,11 @@ void CIRGenFunction::pushDestroy(CleanupKind cleanupKind, Address addr,
 /// The array cannot be zero-length.
 ///
 /// \param begin - a type* denoting the first element of the array
-/// \param end - a type* denoting one past the end of the array
+/// \param numElements - the number of elements in the array
 /// \param elementType - the element type of the array
 /// \param destroyer - the function to call to destroy elements
-void CIRGenFunction::emitArrayDestroy(mlir::Value begin, mlir::Value end,
+void CIRGenFunction::emitArrayDestroy(mlir::Value begin,
+                                      mlir::Value numElements,
                                       QualType elementType,
                                       CharUnits elementAlign,
                                       Destroyer *destroyer) {
@@ -727,9 +728,25 @@ void CIRGenFunction::emitArrayDestroy(mlir::Value begin, mlir::Value end,
   mlir::Type cirElementType = convertTypeForMem(elementType);
   cir::PointerType ptrToElmType = builder.getPointerTo(cirElementType);
 
+  uint64_t size = 0;
+
+  // Optimize for a constant array size.
+  if (auto constantCount = numElements.getDefiningOp<cir::ConstantOp>()) {
+    if (auto constIntAttr = constantCount.getValueAttr<cir::IntAttr>()) {
+      size = constIntAttr.getUInt();
+    }
+  } else {
+    cgm.errorNYI(begin.getDefiningOp()->getLoc(),
+                 "dynamic-length array expression");
+  }
+
+  auto arrayTy = cir::ArrayType::get(cirElementType, size);
+  mlir::Value arrayOp = builder.createPtrBitcast(begin, arrayTy);
+
   // Emit the dtor call that will execute for every array element.
   cir::ArrayDtor::create(
-      builder, *currSrcLoc, begin, [&](mlir::OpBuilder &b, mlir::Location loc) {
+      builder, *currSrcLoc, arrayOp,
+      [&](mlir::OpBuilder &b, mlir::Location loc) {
         auto arg = b.getInsertionBlock()->addArgument(ptrToElmType, loc);
         Address curAddr = Address(arg, cirElementType, elementAlign);
         assert(!cir::MissingFeatures::dtorCleanups());
@@ -772,8 +789,7 @@ void CIRGenFunction::emitDestroy(Address addr, QualType type,
     return;
 
   mlir::Value begin = addr.getPointer();
-  mlir::Value end; // This will be used for future non-constant counts.
-  emitArrayDestroy(begin, end, type, elementAlign, destroyer);
+  emitArrayDestroy(begin, length, type, elementAlign, destroyer);
 
   // If the array destroy didn't use the length op, we can erase it.
   if (constantCount.use_empty())
diff --git a/clang/test/CIR/CodeGen/array-ctor.cpp b/clang/test/CIR/CodeGen/array-ctor.cpp
index c373acf0bff8c..bad4868ed8c34 100644
--- a/clang/test/CIR/CodeGen/array-ctor.cpp
+++ b/clang/test/CIR/CodeGen/array-ctor.cpp
@@ -104,3 +104,72 @@ void zero_sized() {
 // OGCG:       alloca [0 x %struct.S]
 // OGCG-NOT:   call void @_ZN1SC1Ev
 // OGCG:       ret void
+
+void multi_dimensional() {
+    S s[3][5];
+}
+
+// CIR-BEFORE-LPP:     cir.func{{.*}} @_Z17multi_dimensionalv()
+// CIR-BEFORE-LPP:       %[[S:.*]] = cir.alloca !cir.array<!cir.array<!rec_S x 5> x 3>, !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>, ["s", init]
+// CIR-BEFORE-LPP:       %[[FLAT:.*]] = cir.cast(bitcast, %[[S]] : !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>), !cir.ptr<!cir.array<!rec_S x 15>>
+// CIR-BEFORE-LPP:       cir.array.ctor %[[FLAT]] : !cir.ptr<!cir.array<!rec_S x 15>> {
+// CIR-BEFORE-LPP:        ^bb0(%[[ARG:.*]]: !cir.ptr<!rec_S>):
+// CIR-BEFORE-LPP:          cir.call @_ZN1SC1Ev(%[[ARG]]) : (!cir.ptr<!rec_S>) -> ()
+// CIR-BEFORE-LPP:          cir.yield
+// CIR-BEFORE-LPP:       }
+// CIR-BEFORE-LPP:       cir.return
+
+// CIR:     cir.func{{.*}} @_Z17multi_dimensionalv()
+// CIR:       %[[S:.*]] = cir.alloca !cir.array<!cir.array<!rec_S x 5> x 3>, !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>, ["s", init]
+// CIR:       %[[CONST15:.*]] = cir.const #cir.int<15> : !u64i
+// CIR:       %[[DECAY:.*]] = cir.cast(array_to_ptrdecay, {{.*}} : !cir.ptr<!cir.array<!rec_S x 15>>), !cir.ptr<!rec_S>
+// CIR:       %[[END_PTR:.*]] = cir.ptr_stride(%[[DECAY]] : !cir.ptr<!rec_S>, %[[CONST15]] : !u64i), !cir.ptr<!rec_S>
+// CIR:       %[[ITER:.*]] = cir.alloca !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>, ["__array_idx"]
+// CIR:       cir.store %[[DECAY]], %[[ITER]] : !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>
+// CIR:       cir.do {
+// CIR:         %[[CURRENT:.*]] = cir.load %[[ITER]] : !cir.ptr<!cir.ptr<!rec_S>>, !cir.ptr<!rec_S>
+// CIR:         cir.call @_ZN1SC1Ev(%[[CURRENT]]) : (!cir.ptr<!rec_S>) -> ()
+// CIR:         %[[CONST1:.*]] = cir.const #cir.int<1> : !u64i
+// CIR:         %[[NEXT:.*]] = cir.ptr_stride(%[[CURRENT]] : !cir.ptr<!rec_S>, %[[CONST1]] : !u64i), !cir.ptr<!rec_S>
+// CIR:         cir.store %[[NEXT]], %[[ITER]] : !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>
+// CIR:         cir.yield
+// CIR:       } while {
+// CIR:         %[[CURRENT2:.*]] = cir.load %[[ITER]] : !cir.ptr<!cir.ptr<!rec_S>>, !cir.ptr<!rec_S>
+// CIR:         %[[CMP:.*]] = cir.cmp(ne, %[[CURRENT2]], %[[END_PTR]]) : !cir.ptr<!rec_S>, !cir.bool
+// CIR:         cir.condition(%[[CMP]])
+// CIR:       }
+// CIR:       cir.return
+
+// LLVM:     define{{.*}} @_Z17multi_dimensionalv()
+// LLVM:       %[[S:.*]] = alloca [3 x [5 x %struct.S]]
+// LLVM:       %[[START:.*]] = getelementptr %struct.S, ptr %[[S]], i32 0
+// LLVM:       %[[END:.*]] = getelementptr %struct.S, ptr %[[START]], i64 15
+// LLVM:       %[[ITER:.*]] = alloca ptr
+// LLVM:       store ptr %[[START]], ptr %[[ITER]]
+// LLVM:       br label %[[LOOP:.*]]
+// LLVM:     [[COND:.*]]:
+// LLVM:       %[[CURRENT_CHECK:.*]] = load ptr, ptr %[[ITER]]
+// LLVM:       %[[DONE:.*]] = icmp ne ptr %[[CURRENT_CHECK]], %[[END]]
+// LLVM:       br i1 %[[DONE]], label %[[LOOP]], label %[[EXIT:.*]]
+// LLVM:     [[LOOP]]:
+// LLVM:       %[[CURRENT:.*]] = load ptr, ptr %[[ITER]]
+// LLVM:       call void @_ZN1SC1Ev(ptr %[[CURRENT]])
+// LLVM:       %[[NEXT:.*]] = getelementptr %struct.S, ptr %[[CURRENT]], i64 1
+// LLVM:       store ptr %[[NEXT]], ptr %[[ITER]]
+// LLVM:       br label %[[COND]]
+// LLVM:     [[EXIT]]:
+// LLVM:       ret void
+
+// OGCG:     define{{.*}} @_Z17multi_dimensionalv()
+// OGCG:       %[[S:.*]] = alloca [3 x [5 x %struct.S]]
+// OGCG:       %[[START:.*]] = getelementptr{{.*}} %struct.S{{.*}}
+// OGCG:       %[[END:.*]] = getelementptr{{.*}} %struct.S{{.*}} i64 15
+// OGCG:       br label %[[LOOP:.*]]
+// OGCG:     [[LOOP]]:
+// OGCG:       %[[CURRENT:.*]] = phi ptr [ %[[START]], %{{.*}} ], [ %[[NEXT:.*]], %[[LOOP]] ]
+// OGCG:       call void @_ZN1SC1Ev(ptr{{.*}})
+// OGCG:       %[[NEXT]] = getelementptr{{.*}} %struct.S{{.*}} i64 1
+// OGCG:       %[[DONE:.*]] = icmp eq ptr %[[NEXT]], %[[END]]
+// OGCG:       br i1 %[[DONE]], label %[[EXIT:.*]], label %[[LOOP]]
+// OGCG:     [[EXIT]]:
+// OGCG:       ret void
diff --git a/clang/test/CIR/CodeGen/array-dtor.cpp b/clang/test/CIR/CodeGen/array-dtor.cpp
index 3edc6f1a6538d..36db265a6dfed 100644
--- a/clang/test/CIR/CodeGen/array-dtor.cpp
+++ b/clang/test/CIR/CodeGen/array-dtor.cpp
@@ -102,3 +102,74 @@ void test_cleanup_zero_length_array() {
 // OGCG:       alloca [0 x %struct.S]
 // OGCG-NOT:   call void @_ZN1SD1Ev
 // OGCG:       ret void
+
+void multi_dimensional() {
+    S s[3][5];
+}
+
+// CIR-BEFORE-LPP:     cir.func{{.*}} @_Z17multi_dimensionalv()
+// CIR-BEFORE-LPP:       %[[S:.*]] = cir.alloca !cir.array<!cir.array<!rec_S x 5> x 3>, !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>, ["s"]
+// CIR-BEFORE-LPP:       %[[FLAT:.*]] = cir.cast(bitcast, %[[S]] : !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>), !cir.ptr<!cir.array<!rec_S x 15>>
+// CIR-BEFORE-LPP:       cir.array.dtor %[[FLAT]] : !cir.ptr<!cir.array<!rec_S x 15>> {
+// CIR-BEFORE-LPP:       ^bb0(%[[ARG:.*]]: !cir.ptr<!rec_S>):
+// CIR-BEFORE-LPP:         cir.call @_ZN1SD1Ev(%[[ARG]]) nothrow : (!cir.ptr<!rec_S>) -> ()
+// CIR-BEFORE-LPP:         cir.yield
+// CIR-BEFORE-LPP:       }
+// CIR-BEFORE-LPP:       cir.return
+
+// CIR:     cir.func{{.*}} @_Z17multi_dimensionalv()
+// CIR:       %[[S:.*]] = cir.alloca !cir.array<!cir.array<!rec_S x 5> x 3>, !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>, ["s"]
+// CIR:       %[[FLAT:.*]] = cir.cast(bitcast, %[[S]] : !cir.ptr<!cir.array<!cir.array<!rec_S x 5> x 3>>), !cir.ptr<!cir.array<!rec_S x 15>>
+// CIR:       %[[CONST14:.*]] = cir.const #cir.int<14> : !u64i
+// CIR:       %[[DECAY:.*]] = cir.cast(array_to_ptrdecay, %[[FLAT]] : !cir.ptr<!cir.array<!rec_S x 15>>), !cir.ptr<!rec_S>
+// CIR:       %[[END_PTR:.*]] = cir.ptr_stride(%[[DECAY]] : !cir.ptr<!rec_S>, %[[CONST14]] : !u64i), !cir.ptr<!rec_S>
+// CIR:       %[[ITER:.*]] = cir.alloca !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>, ["__array_idx"]
+// CIR:       cir.store %[[END_PTR]], %[[ITER]] : !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>
+// CIR:       cir.do {
+// CIR:         %[[CUR:.*]] = cir.load %[[ITER]] : !cir.ptr<!cir.ptr<!rec_S>>, !cir.ptr<!rec_S>
+// CIR:         cir.call @_ZN1SD1Ev(%[[CUR]]) nothrow : (!cir.ptr<!rec_S>) -> ()
+// CIR:         %[[NEG1:.*]] = cir.const #cir.int<-1> : !s64i
+// CIR:         %[[PREV:.*]] = cir.ptr_stride(%[[CUR]] : !cir.ptr<!rec_S>, %[[NEG1]] : !s64i), !cir.ptr<!rec_S>
+// CIR:         cir.store %[[PREV]], %[[ITER]] : !cir.ptr<!rec_S>, !cir.ptr<!cir.ptr<!rec_S>>
+// CIR:         cir.yield
+// CIR:       } while {
+// CIR:         %[[CHK:.*]] = cir.load %[[ITER]] : !cir.ptr<!cir.ptr<!rec_S>>, !cir.ptr<!rec_S>
+// CIR:         %[[CMP:.*]] = cir.cmp(ne, %[[CHK]], %[[DECAY]])
+// CIR:         cir.condition(%[[CMP]])
+// CIR:       }
+// CIR:       cir.return
+
+// LLVM:     define{{.*}} void @_Z17multi_dimensionalv()
+// LLVM:       %[[S:.*]] = alloca [3 x [5 x %struct.S]]
+// LLVM:       %[[START:.*]] = getelementptr %struct.S, ptr %[[S]], i32 0
+// LLVM:       %[[END:.*]] = getelementptr %struct.S, ptr %[[START]], i64 14
+// LLVM:       %[[ITER:.*]] = alloca ptr
+// LLVM:       store ptr %[[END]], ptr %[[ITER]]
+// LLVM:       br label %[[LOOP:.*]]
+// LLVM: [[COND:.*]]:
+// LLVM:       %[[CURRENT_CHECK:.*]] = load ptr, ptr %[[ITER]]
+// LLVM:       %[[DONE:.*]] = icmp ne ptr %[[CURRENT_CHECK]], %[[START]]
+// LLVM:       br i1 %[[DONE]], label %[[LOOP]], label %[[EXIT:.*]]
+// LLVM: [[LOOP]]:
+// LLVM:       %[[CUR:.*]] = load ptr, ptr %[[ITER]]
+// LLVM:       call void @_ZN1SD1Ev(ptr %[[CUR]])
+// LLVM:       %[[PREV:.*]] = getelementptr %struct.S, ptr %[[CUR]], i64 -1
+// LLVM:       store ptr %[[PREV]], ptr %[[ITER]]
+// LLVM:       br label %[[COND]]
+// LLVM: [[EXIT]]:
+// LLVM:       ret void
+
+// OGCG:     define{{.*}} void @_Z17multi_dimensionalv()
+// OGCG:       %[[ARRAY:.*]] = alloca [3 x [5 x %struct.S]]
+// OGCG:       %[[START:.*]] = getelementptr{{.*}} %struct.S{{.*}}
+// OGCG:       %[[END:.*]] = getelementptr{{.*}} %struct.S{{.*}} i64 15
+// OGCG:       br label %[[LOOP:.*]]
+// OGCG: [[LOOP]]:
+// OGCG:       %[[NEXT:.*]] = phi ptr [ %[[END]], %{{.*}} ], [ %[[LAST:.*]], %[[LOOP]] ]
+// OGCG:       %[[LAST]] = getelementptr{{.*}} %struct.S{{.*}}, ptr %[[NEXT]], i64 -1
+// OGCG:       call void @_ZN1SD1Ev(ptr{{.*}} %[[LAST]])
+// OGCG:       %[[DONE:.*]] = icmp eq ptr %[[LAST]], %[[START]]
+// OGCG:       br i1 %[[DONE]], label %[[EXIT:.*]], label %[[LOOP]]
+// OGCG: [[EXIT]]:
+// OGCG:       ret void
+

@github-actions
Copy link

github-actions bot commented Sep 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mmha
Copy link
Contributor Author

mmha commented Sep 19, 2025

I wonder whether we should immediately flatten multidimensional arrays in CIR instead of casting pointers. Do we need this information for something?

Maybe it's also better for ArrayCtor/ArrayDtor to use a pair of pointers instead of relying on the ArrayType size in LoweringPrepare. That makes easier to implement VLAs later on and it's possible that it could improve code size when generating cleanups in the case of destructors because we don't have to generate a new loop for each element in the array.

@erichkeane
Copy link
Collaborator

I don't have a good idea whether this is correct here, and Morris is on PTO the next 2 weeks, and this blocks me on my next OpenACC task. So hopefully @andykaylor can fixup anything that needs fixing up when he gets back next week, but prompt review would be appreciated!

Copy link
Contributor

@xlauko xlauko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@xlauko
Copy link
Contributor

xlauko commented Sep 19, 2025

I wonder whether we should immediately flatten multidimensional arrays in CIR instead of casting pointers. Do we need this information for something?

Maybe it's also better for ArrayCtor/ArrayDtor to use a pair of pointers instead of relying on the ArrayType size in LoweringPrepare. That makes easier to implement VLAs later on and it's possible that it could improve code size when generating cleanups in the case of destructors because we don't have to generate a new loop for each element in the array.

Not sure about specific usecase, but in general I would be for keeping as much info as we can if does not cost us much.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I wonder whether we should immediately flatten multidimensional arrays in CIR instead of casting pointers. Do we need this information for something?

Sounds like we might lose some of the type information flow here, however it's possible we might still have all info needed to reconstruct without too much work if needed, how much work does it actually simplify in lowering prepare though? Looks like we can evaluate this in a follow up PR?


// Optimize for a constant array size.
if (auto constantCount = numElements.getDefiningOp<cir::ConstantOp>()) {
if (auto constIntAttr = constantCount.getValueAttr<cir::IntAttr>()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for curlies here

@mmha
Copy link
Contributor Author

mmha commented Sep 19, 2025

I mostly wondered if there's an actual usecase here and it makes array to pointer decays a little awkward (though not too painful IMO).

I think the better approach would be to change the Array structor ops to accept a pair of pointers instead of an array. That makes it simple to add VLA support later on. I can explore such a change in a follow up PR.

@mmha mmha merged commit 7165cf3 into llvm:main Sep 19, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants